A Joint Segmenting and Labeling Approach for Chinese Lexical Analysis
Authors
Abstract
This paper introduces an approach that jointly performs a cascade of segmentation and labeling subtasks for Chinese lexical analysis, including word segmentation, named entity recognition, and part-of-speech tagging. Unlike the traditional pipeline, the cascaded subtasks are carried out simultaneously in a single step, so error propagation can be avoided and information can be shared among the multi-level subtasks. The approach adopts Weighted Finite State Transducers (WFSTs): within the unified WFST framework, the model for each subtask is represented, and the models are then combined into a single one. One-pass decoding then yields the jointly optimal outputs for the multi-level processes. The experimental results show the effectiveness of the joint processing approach, which significantly outperforms the traditional pipeline method.
Similar resources
The effects of task complexity on Chinese learners’ language production: A synthesis and meta-analysis
The present meta-analysis was conducted to provide a quantitative measure of the overall effects of task complexity on Chinese EFL learners’ language production. Based on the strict inclusion criteria, 12 primary studies were synthesized according to key features. Eleven of them were meta-analyzed to investigate effects of raising the resource-directing task comple...
Improving Chunk-based Semantic Role Labeling with Lexical Features
We present an approach for Semantic Role Labeling (SRL) using Conditional Random Fields in a joint identification/classification step. The approach is based on shallow syntactic information (chunks) and a number of lexicalized features such as selectional preferences and automatically inferred similar words, extracted using lexical databases and distributional similarity metrics. We use semanti...
Exploring Impacts of Consciousness-raising in a Genre-based Pedagogy
This study reports on the findings of a genre teaching course for developing academic writing of a class of EFL students in Iran. The information report genre was taught in a cyclical way of teaching and learning, which was started from ‘setting the context’ and ‘deconstruction’ of prototype information report genre, and continued with ‘joint construction’, ‘independent construction’, and final...
Examining the Effect of Ideology and Idiosyncrasy on Lexical Choices in Translation Studies within the CDA Framework
Using a critical discourse analytic model of translation criticism, the present study attempts to explore the effect of ideology and idiosyncrasy on the lexical choices in translation studies. The study employed a descriptive approach to answer two research questions: Is there any relationship between ideology and idiosyncratic features of translators' lexical choices? And if yes, can it be ana...
A Heuristic Method for Chinese Segmentation
Research and development in digital library includes content creation, conversion, indexing, organization, and dissemination, where the key technological issues are how to search and display desired selections from and across large collections effectively [10]. A repository is an indexed collection of objects. Indexing is an important task for searching. The better the indexing, the better the ...
Publication date: 2008